Abstract

Background: In Vitis vinifera L., domestication induced a dramatic change in flower morphology: the wild sylvestris
subspecies is dioecious while hermaphroditism is largely predominant in the domesticated subsp. V. v. vinifera. The 
characterisation of polymorphisms in genes underlying the sex-determining chromosomal region may help clarify 
the history of domestication in grapevine and the evolution of sex chromosomes in plants. In the genus Vitis, sex 
determination is putatively controlled by one major locus with three alleles, male M, hermaphrodite H and female F, 
with an allelic dominance M > H > F. Previous genetic studies located the sex locus on chromosome 2. We used DNA
polymorphisms of geographically diverse V. vinifera genotypes to confirm the position of this locus, to characterise the 
genetic diversity and traces of selection in candidate genes, and to explore the origin of hermaphroditism. 

Results: In V. v. sylvestris, a sex-determining region of 154.8 kb, also present in other Vitis species, spans less than 1% of
chromosome 2. It displays haplotype diversity, linkage disequilibrium and differentiation that typically correspond to a 
small XY sex-determining region with XY males and XX females. In male alleles, traces of purifying selection were found 
for a trehalose phosphatase, an exostosin and a WRKY transcription factor, with strikingly low polymorphism levels
between distant geographic regions. Both diversity and network analysis revealed that H alleles are more closely 
related to M than to F alleles. 

Conclusions: Hermaphrodite alleles appear to derive from male alleles of wild grapevines, with successive 
recombination events allowing import of diversity from the X into the Y chromosomal region and slowing down the 
expansion of the region into a full heteromorphic chromosome. Our data are consistent with multiple domestication 
events and show traces of introgression from other Asian Vitis species into the cultivated grapevine gene pool.

Background

The wild grapevine, Vitis vinifera L. subsp. sylvestris, is 
the wild ancestor of the domesticated grapevine V. v. 
vinifera [1,2], cultivated for wine and table grape pro- 
duction [3]. The genus Vitis, a monophyletic taxon of 
the family Vitaceae [4,5], includes approximately sixty 
species present mainly in Asia and America, all of 
which -except the domesticated grapevine- are dioe- 
cious (male and female flowers borne on different 
plants) [6,7]. During grapevine domestication, flower 
reproductive morphology has incurred radical modifi- 
cations, with the change from dioecy to hermaphrodit- 
ism in domesticated grapevines [8]. The geographic origin 
of hermaphroditism development in the domesticated 
grapevine is still not elucidated, nor is it known whether it 
occurred during primary [1,9] and/or secondary domesti- 
cation events believed to have occurred in geographically 
distinct areas around the Mediterranean [10,11]. 

Sex expression in Vitis flowers is thought to be controlled by a major locus with three alleles,
male M, hermaphrodite H and female F, with an M > H > F allelic dominance [6,7,12-14]. Several
genetic maps based on interspecific crosses have confirmed that sex determinism in the genus
Vitis is under the control of a single major genomic region located on chromosome 2, close to
the SSR marker VVIB23 [15-17]. Recently, a complex interspecific cross (V. vinifera x
[V. riparia x V. cinerea]) was used by Fechter et al. [18] to narrow the location of the sex
locus to a 143 kb genomic region located between positions 4,907,434 and 5,037,597 bp of
chromosome 2 on the physical map of the V. vinifera reference genome sequence (PN40024 12x.0
version [19]). So far, the co-localisation on chromosome 2 of the sex locus in V. vinifera
subsp. vinifera has been confirmed only in the genetic map of one intra-specific cross [20],
with a recombination distance of 0.4 cM from the nearest genetic marker (VVIB23). Moreover, in
the V. v. sylvestris subspecies, the sex locus localisation remains to be confirmed.

The evolution of proper sex chromosomes is quite rare 
in plants: indeed, approx. 40 species of flowering plants 
are currently known to have developed sex chromosomes 
and among them, half have heteromorphic sex chromo- 
somes [21]. A sex chromosome may start to develop in di- 
oecious species through the suppression of recombination 
between male- and female-sterile mutations with comple- 
mentary dominance in close proximity on a chromosome 
[22]. Then, this sex-determination region would gradually 
grow in size, increasingly incorporating sex-linked genes 
and eventually evolving into heteromorphic sex chromosomes [21,22]. Some of the processes
involved in sex chromosome evolution, such as the suppression of genetic recombination or the
genetic degeneration of the Y chromosome, are not well understood, and only the study of
sex-determining systems in different species and at different steps of evolution could provide
some answers [23]. While the sex determination locus in Vitis species
was mainly studied to develop genetic markers for early 
sexing for breeding purposes [18,20], the work of Fechter 
et al. [18], evidencing a small sex-determination region, 
suggests that Vitis species could be good candidates to 
study the early steps of sex chromosome evolution. 

In the present study, we explore the sequence polymor- 
phisms near the sex locus in a genetically and geographic- 
ally diverse panel of wild and domesticated grapevines, with 
the objectives to: i) confirm the position and boundaries of 
the sex locus in V. vinifera subsp. sylvestris; ii) characterise 
the sex region in terms of linkage disequilibrium, genetic 
diversity, selection signature and candidate genes; and iii) 
use this information to explore the geographic and genetic 
origin of hermaphroditism in domesticated grapevine. 

Since wild grapevines carry the ancestral form of the 
sex locus from which the domesticated grapevine herm- 
aphroditism derived, we first mapped sequence polymor- 
phisms linked to the sex trait in Vitis vinifera subsp. 
sylvestris. Then, we compared the polymorphisms linked 
to the sex trait in diverse wild and domesticated grape- 
vine populations to study the origin of hermaphroditism 
in domesticated grapevines. 

Methods 

Plant material and phenotypic trait data 

The plant material consisted of 73 wild (39 females and 34 
males) and 39 hermaphrodite domesticated grapevines 
(Additional file 1). These grapevines were chosen among 
139 wild genotypes and 2,323 domesticated genotypes [24]
to maximise both genetic diversity and geographic repre- 
sentation. Three genotypes from other species were also 
considered to represent genetic variation in the subgenus 
Vitis: V. balanseana, V. coignetiae and V. monticola [25].
The grapevines were sampled either in natural populations 
or from the French National Grapevine Germplasm Collection (INRA, Domaine de Vassal, France;
http://www1.montpellier.inra.fr/vassal/). The genotypes considered varied according to the
genetic analyses (Additional file 1).
Sex phenotypes (male, female or hermaphrodite) were 
evaluated by observations of flower morphology repeated 
over several years, and coded according to the Inter- 
national Organization of Vine and Wine descriptors (code 
number OIV-151 [26]). 

DNA extraction 

DNA was extracted from 150 mg of leaves according to the DNeasy Plant Mini Kit (Qiagen)
instructions, with 1% polyvinylpyrrolidone (PVP 40,000) and 1% β-mercaptoethanol added to
buffer AP1 to eliminate polyphenols, strong inhibitors of in vitro enzymatic reactions that are
abundantly present in the crude grape cell lysate.

Amplicon sequencing

Several studies located the sex locus in Vitis close to the SSR marker VVIB23 on chromosome 2
[15-17,20,27]. In addition, we preliminarily confirmed this locus in Vitis vinifera subsp.
sylvestris, using 11 SSR markers segregating in several intra-specific crosses resulting from
open pollination (data not shown). Using this information, we designed 11 amplicons to cover a
region between positions 4,781,551 bp and 5,037,597 bp of chromosome 2 (PN40024 grapevine
genome reference sequence, version 12x.0 [19]; Table 1, Additional file 2 for primer
sequences). This region covers both the VVIB23 SSR marker and the 143 kb region as defined by
Fechter et al. [18] (Figure 1). We did not extend the coverage further downstream as we found
that the SSR marker VMC3b10 (position 5,057,413 bp) was not associated with sex segregation in
our wild grapevine mapping populations (data not shown).

According to Fechter et al. [18], the 143 kb region of chromosome 2 (12x.0 version) between
4,907,434 and 5,050,616 bp corresponds to the female allele of the hermaphroditic Pinot Noir
40024, while the slightly different hermaphrodite allele is located on the unassigned
scaffold_233 (chromosome UnRandom of the 12x.0). The 12x.0 scaffold_233 is collinear with
chromosome 2 of the 8x grape genome reference sequence [19]; both these assemblies display two
regions which are absent from the chromosome 2 assembly of the 12x.0 reference sequence
version: a region between the 3-oxoacyl synthase III C-terminal (KASIII) and the PLATZ
transcription factor, and the adenine phosphoribosyl transferase (APT3) region [18]. The APT3
distinguishes female individuals from male and hermaphroditic ones [18]. A gene, the
phosphatidic acid phosphatase 2 (PAP2), is not predicted by the Gaze annotation of the 12x.0
reference sequence version but it is annotated by the Gaze annotation of the 8x reference
sequence version and confirmed by Fechter et al. [18] on the 12x.0 reference sequence version.

For our work, eight primer pairs out of the eleven could thus be designed using the Gaze
annotation of the 12x.0 reference sequence version (Table 1, Figure 1). A primer pair (VSVV011)
was developed in the PAP2 gene using the Gaze annotation of the 8x reference sequence version
(Table 1). Another primer pair (VSVV010) was specifically developed to cover the region of the
putative APT3 distinguishing female individuals from male and hermaphroditic ones [18]. A last
amplicon (VSVV008) was designed to amplify a gene present in the region between KASIII and the
PLATZ transcription factor on the 12x.0 scaffold_233; the predicted protein of this gene
matches an Ethylene Overproducer-like 1 gene (ETO1, blastx E-value = 4e-83). For the ETO1 and
APT3 amplicons, the positions on the grape genome physical maps were estimated based upon a
manual realignment of the unassigned scaffold_233 (chromosome UnRandom of the 12x.0) and the 8x
reference sequence version respectively, on chromosome 2 of the 12x.0 reference sequence
version. As a consequence, in our work the 12x.0 positions of these two amplicons are
approximate (Table 1).

Table 1 Characteristics of the amplicons used in this study to cover the sex locus and its edges

Amplicon name   Position (bp)          Amplicon size (bp)   Gene name*           Putative function*
VSVV001         4781551 - 4782603      1053                 GSVIVT01004916001    Esterase/lipase/thioesterase family protein
VSVV002         4822617 - 4824068      1452                 GSVIVT01001263001    SAUR family protein
VSVV003         4850582 - 4851997      1416                 GSVIVT01001267001    Pentapeptide repeat protein
VSVV004         4861475 - 4862891      1417                 GSVIVT01001269001    Yabby14 protein
VSVV005         4883461 - 4884818      1358                 GSVIVT01001272001    Soluble acid invertase
VSVV006         4900275 - 4901493      1219                 GSVIVT01001275001    Trehalose-6-phosphate phosphatase (TPP)
VSVV007         4921838 - 4923352      1515                 GSVIVT01001277001    Exostosin family protein
VSVV008†        4953195 - 4954179**    984                  GSVIVT01004781001    Ethylene Overproducer-like 1 (ETO1)
VSVV009         4989467 - 4990268      802                  GSVIVT01001286001    WRKY transcription factor 21
VSVV010‡        5009549 - 5010222**    673                  GSVIVT00007310001    Adenine phosphoribosyltransferase (APT3)
VSVV011§        5036645 - 5037597      953                  GSVIVT00007312001    Phosphatidic acid phosphatase 2 (PAP2)

*Gaze annotation. **Approximate values. †PN40024 reference sequence, 12x.0 version, amplicon
position 16,072,323-16,073,307, scaffold_233, chromosome UnRandom. ‡PN40024 reference sequence,
8x version, amplicon position 5,192,572-5,193,382, scaffold 187, chromosome 2. §Primers
developed in the gene predicted using the 8x Gaze annotation and confirmed by Fechter et al.
[18] on the 12x.0 reference sequence version.

Figure 1 Amplicon position in the sex locus and its boundaries on chromosome 2 of the 12x.0
reference sequence version. a) VVIB23 SSR marker (light blue rectangle) and amplicon position
(red ellipses); b) Amplicon position and gene Gaze annotation in the 143 kb sex locus defined
by Fechter et al. [18]. The 12x.0 annotated genes are represented in dark blue and our
amplicons in red. For the APT3 and the ETO1 gene, we used the synteny between chromosome 2 of
the 8x reference sequence version, the unassembled scaffold_233 of the 12x.0 reference sequence
version, and the BAC sequencing maps of V. riparia and V. cinerea [18] to estimate their
relative position on chromosome 2, 12x.0 version (see Methods). The phosphatidic acid
phosphatase 2 (PAP2) is not predicted by the Gaze annotation of the 12x.0 reference sequence
version but it is annotated by the Gaze annotation of the 8x reference sequence version and
confirmed by Fechter et al. [18] on the 12x.0 version.

Genes shown in panel b (putative function, position on chromosome 2 in bp, Gaze gene name):
Trehalose-6-phosphate phosphatase (TPP)                   4898034-4901145   GSVIVT01001275001
Unknown function                                          4914168-4914984   GSVIVT01001276001
Exostosin family protein                                  4921780-4923609   GSVIVT01001277001
3-Oxoacyl-(acyl-carrier-protein) synthase III (KASIII)    4923752-4935337   GSVIVT01001278001
Ethylene overproducer-like 1 (ETO1)                       4953998-4959471   GSVIVT01004781001
PLATZ transcription factor                                4948964-4950534   GSVIVT01001279001
Flavin-containing monooxygenase                           4951821-4956004   GSVIVT01001280001
Flavin-containing monooxygenase                           4957741-4960852   GSVIVT01001281001
Flavin-containing monooxygenase                           4962612-4965728   GSVIVT01001282001
Flavin-containing monooxygenase                           4974557-4978139   GSVIVT01001284001
Unknown function                                          4983356-4986675   GSVIVT01001285001
WRKY transcription factor 21                              4989461-4989778   GSVIVT01001286001
Adenine phosphoribosyltransferase (APT3, Fechter et al. [18])   5009498-5010308   GSVIVT00007310001
Phosphatidic acid phosphatase 2 (PAP2)                    5036698-5037504   GSVIVT00007320001

All primer pairs were designed using the Primer3 software v.0.4.0 [28,29] so as to amplify
stretches between 600 and 1,300 bp and cover a part of the promoter and the first exons and
introns. Thermocycling consisted of an initial stringent phase (94°C for 3 minutes followed by
12 cycles of 94°C for 30 seconds, annealing from 65 to 56°C decreasing by 0.70°C at each cycle
for 45 seconds, and 72°C for 120 seconds) and an additional 25 cycles of 94°C for 30 seconds,
56°C for 45 seconds and 72°C for 90-120 seconds. Sequencing was performed on PCR products
purified using the AMPure kit (Agencourt, MA, USA); the BigDye Terminator v3.1 Cycle Sequencing
Kit (Applied Biosystems, CA, USA) was used following the standard protocol, and reaction
products were purified with the CleanSEQ kit (Agencourt) and read on a 3130xl Genetic Analyzer
(Applied Biosystems). Raw sequence files (ABI format) were imported, aligned and trimmed using
the Staden software v.2.0.0 [30]; SNP calling was carried out manually using the Staden
interface. Then, fasta files were exported and subsequently analysed in other software and
pipelines.

Identification of sequence polymorphisms linked to the 
sex trait 

Phenotypic sex inheritance in wild grapevines produces only male and female variants, with a
ratio close to 1:1 in adult populations (even if some variation in sex phenotypes has been
observed [13,26], in our sample only two morphs were found, M and F). The most parsimonious
hypothesis we could make on sex inheritance in grape, based on previous observations,
preliminary data analysis, and literature survey [6,7,17,18,20], was that of an XY system,
where, at the sex locus, the female is homozygous (XX) and the male is heterozygous (XY).

To map the sex locus on the genome, we first used a genetic association approach, looking for
correlations between flower sex phenotypes and sequence polymorphisms in a panel of diverse
wild genotypes from different geographic provenances (Additional file 1). However, the use of
general or mixed linear models searching for association resulted in too many false positives
(SNPs that were correlated to sex but explained only a portion of the phenotypes). Thus, we
used an approach similar to Siegismund [31], using Fisher tests to compare, for each
polymorphism and for male and female wild grapevines separately, the expected and observed
proportions of homozygous and heterozygous genotypes. The expected proportions were assumed to
follow the Hardy-Weinberg law and were calculated from the allele frequencies observed in the
entire population (sum of male and female individuals). The observed counts were the number of
homozygous and heterozygous genotypes actually recorded in male and female grapevines. Indels
were coded as present/absent (Additional file 3). Fisher tests were calculated with the
fisher.test function of the R statistical software [32]. We only considered sequence
polymorphisms with less than 20% missing data and with a minor allele frequency in the sample
higher than 5%. A test was considered significant when the probability of deviation from the
null hypothesis was below a 0.05 P-value threshold adjusted by a Bonferroni correction for
multiple hypothesis testing (0.05/n, with n corresponding to the total number of studied
polymorphisms).
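
The test described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the R code used in the study: the way the 2 x 2 table is built here (observed counts against rounded Hardy-Weinberg expected counts) is an assumption, and the function names are hypothetical. The worked example uses a polymorphism for which all 31 males are heterozygous and all 34 females are homozygous.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

def hw_expected_counts(p, n):
    """Expected (homozygous, heterozygous) counts among n individuals.

    p is the frequency of one allele in the whole population (both sexes);
    rounding to whole individuals is an assumption of this sketch.
    """
    het = round(2 * p * (1 - p) * n)
    return n - het, het

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

# Worked example: 31 heterozygous males, 34 homozygous females.
n_males, n_females = 31, 34
p_minor = n_males / (2 * (n_males + n_females))  # allele carried only by males
exp_hom, exp_het = hw_expected_counts(p_minor, n_males)
p_value = fisher_exact_2x2(0, n_males, exp_hom, exp_het)  # observed males: 0 hom, 31 het
threshold = bonferroni_threshold(0.05, 146)
print(p_value, threshold, p_value < threshold)
```

With 146 tested polymorphisms the corrected threshold is 0.05/146, whose negative base-10 logarithm is about 3.47.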

Linkage disequilibrium in the sex region 

To explore linkage disequilibrium between and within amplicons covering the sex region, we used
the Measure.R2VS() function in the R package LDcorSV [33]. r²VS is the square of each pairwise
correlation corrected by both the relatedness and genetic structure of the sample [33]. The
sample considered here was composed of 18 male and 18 female individuals (Additional file 1).
These 36 specimens were chosen among those with the least missing data, eliminating the most
closely related individuals and balancing their geographic representation. The genetic
structure matrix was calculated from a dataset of 20 SSRs [24] of all the wild genotypes in
this study, using STRUCTURE software [34]. We used the model with uncorrelated allele
frequencies, admixture, and no prior population information, previously shown to be pertinent
in grapevine [35]. Ten STRUCTURE runs (each with 5 x 10^ iterations and 5 x 10^ replicates) for
each K level were obtained and compared to estimate group assignation stability. The most
probable number of sub-populations was inferred based on both the similarity pattern among the
10 STRUCTURE replicates and Evanno's ΔK statistic [36]. The kinship matrix was obtained using
ML-Relate software [37] with the same SSR markers and genotypes as above.
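
For orientation, the quantity that r²VS corrects is the ordinary squared correlation between genotype scores at two polymorphisms. The sketch below computes only this uncorrected r²; it does not apply the structure and kinship corrections of LDcorSV, and the 0/1/2 genotype coding and the function name are assumptions made for illustration.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2.

    This is the uncorrected r2, without any adjustment for population
    structure or kinship.
    """
    if len(x) != len(y):
        raise ValueError("genotype vectors must have the same length")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic locus: r2 is undefined")
    return sxy ** 2 / (sxx * syy)

# Two fully sex-linked SNPs: males (first three) heterozygous, females homozygous.
snp_a = [1, 1, 1, 0, 0, 0]
snp_b = [1, 1, 1, 0, 0, 0]
print(r_squared(snp_a, snp_b))
```

Two polymorphisms that both follow the XY pattern perfectly are in complete LD under this measure, which is why high values are expected within a non-recombining sex-determining region.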

Diversity in M, F and H haplotypes and signature 
of selection 

To compare the diversity of male, female and hermaphrodite alleles at the significant
sex-linked amplicons (see Additional file 1 for the genotypes considered), the haplotypes were
reconstructed using PHASE v2.1 with default parameter values [38,39]. The attribution of
individual haplotypes to the M, F and H groups (hereafter called haplogroups) was carried out
with the help of haplotype trees (Additional file 4) built with a maximum likelihood method
(PhyML 3.0 [40]) implemented in SeaView v4.3.3 [41] and based on the Generalised
Time-Reversible (GTR) model [42].

Genetic diversity in M, F and H haplotypes was evaluated with the following statistics: number
of haplotypes (Nh), number of segregating sites (S), haplotype diversity (H) and nucleotide
diversity (π). In order to detect a signature of selection in the sex region, Tajima's D [43]
and Fu and Li's D* [44] statistics were calculated with the DnaSP v5 software [45] separately
for the male, female and hermaphrodite haplogroups. To confirm traces of selection detected on
the male haplogroups with the Tajima's D and the Fu and Li's D* tests, the E statistic and the
DH test [46] were computed using the male haplotype of V. balanseana as an outgroup (Table 3).

Finally, we evaluated the intraspecific genetic differentiation between male, female and
hermaphrodite haplogroups, and the interspecific differentiation between V. v. sylvestris and
Vitis species haplotypes, using the Fst statistic [47,48], also with DnaSP v5. The Vitis
species used for this statistic were V. balanseana, V. monticola and V. coignetiae.
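
As an illustration of the neutrality statistics mentioned above, Tajima's D can be computed from three summary values: the number of sequences, the number of segregating sites and the mean number of pairwise differences. The sketch below follows the standard published formula; it is not the DnaSP implementation, and the function name and argument names are hypothetical.

```python
from math import sqrt

def tajimas_d(n, s, k):
    """Tajima's D.

    n: number of sequences sampled
    s: number of segregating sites
    k: mean number of pairwise differences (per-site pi times sequence length)
    """
    if n < 3 or s < 1:
        raise ValueError("Tajima's D needs n >= 3 and at least one segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# 22 haplotypes with a single site carried by one haplotype only (k = 2/22).
print(round(tajimas_d(22, 1, 2 / 22), 2))
```

A strongly negative D indicates an excess of rare variants relative to neutral expectations, which is the pattern interpreted in this study as a trace of selection in the male haplogroups.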

Origin of the H haplotypes 

Combining the haplotypes of the four sex-linked amplicons, the M, F and H macrohaplotypes were
reconstructed. PHASE v2.1 was run again with a burn-in period of 100, 100 iterations, a
thinning interval of 1 and 10 repeats. The algorithm was run several times to validate
convergence. Then, to understand the origin of H haplotypes in the domesticated grapevine, a
network analysis was carried out on the F, M and H macrohaplotypes using the median-joining
method as described in Bandelt et al. [49] and implemented in Network v4.6.1.1 [50]. A Star
Contraction was run before running the network calculation.

Finally, the relationship between the network distances (in number of mutations) of the H
haplotypes from the M haplogroup, and the geographic origin, grape use (table, wine or both),
degree of domestication (ancient or modern cultivars [51]) and the genetic structure ancestry
of the domesticated grapevines [35] was explored using an ANOVA.

Table 2 Allocation of 0, 1 or 2 female haplotypes (F) to the hermaphrodite, male and female
genotypes, according to the maximum likelihood trees, for the four sex-linked amplicons

Genotype            VSVV006   VSVV007   VSVV009   VSVV010
Hermaphrodite       n = 22    n = 22    n = 22    n = 21
  0 F haplotype     1         0         1         9
  1 F haplotype     21        22        19        8
  2 F haplotypes    0         0         2         4
Male                n = 22    n = 22    n = 22    n = 18
  0 F haplotype     0         0         0         0
  1 F haplotype     22        22        22        18
  2 F haplotypes    0         0         0         0
Female              n = 24    n = 24    n = 24    n = 22
  0 F haplotype     0         0         0         0
  1 F haplotype     0         0         0         0
  2 F haplotypes    24        24        24        22



Results 

Sequence polymorphisms linked to the sex trait 

Eleven amplicons representing 9,523 bp in total and designed to partly amplify gene sequences
were chosen to cover the sex locus and its boundaries [18,20]. Sequencing these 11 amplicons on
a sample of 65 genetically and geographically diverse wild genotypes (31 males and 34 females,
Additional file 1, [GenBank: KJ575622-KJ57662]) allowed the identification of 146 polymorphic
sites (Additional file 3): 137 SNPs and 9 indels. Thirty-six SNPs were located in introns and
twenty in exons, among which ten were non-synonymous. For 51 and 64 polymorphisms in female and
male genotypes respectively, the genotype proportions deviated significantly from the
Hardy-Weinberg proportions (Figure 2b). These significant polymorphisms were mainly found in
VSVV006, VSVV007, VSVV009 and VSVV010 (87.04% of the significant polymorphisms in females and
90.60% in males).



Among the significant polymorphisms, 28 perfectly fitted the XY sex determination model. For
these polymorphisms, 100% of the male genotypes were heterozygous and 100% of the female
genotypes were homozygous for the most frequent allele, i.e. for example males were A/T and
females were A/A but never T/T (Figure 2c, Additional file 3). In hermaphrodite domesticated
genotypes, these same polymorphisms were in the majority of cases in a heterozygous state
(Additional file 5). These 28 polymorphisms, perfectly fitting the XY model, were only found in
the VSVV006, VSVV007 and VSVV009 amplicons and 3 of them resulted in non-synonymous amino acid
changes (38th, 61st and 66th polymorphisms in VSVV006 or VSVV007, Additional file 3).
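
The criterion used above for a perfect fit to the XY model can be written as a simple check on genotype lists. The sketch below is illustrative only: the two-character genotype strings and the function name are assumptions, not the data format used in the study.

```python
from collections import Counter

def fits_xy_model(male_genotypes, female_genotypes):
    """Return True if every male is heterozygous and every female is
    homozygous for the most frequent allele in the whole sample.

    Genotypes are two-character strings such as "AT" or "AA".
    """
    alleles = Counter(a for g in male_genotypes + female_genotypes for a in g)
    major = alleles.most_common(1)[0][0]
    males_het = all(g[0] != g[1] for g in male_genotypes)
    females_hom_major = all(g[0] == g[1] == major for g in female_genotypes)
    return males_het and females_hom_major

males = ["AT"] * 31
females = ["AA"] * 34
print(fits_xy_model(males, females))           # strict fit
print(fits_xy_model(males + ["AA"], females))  # one non-heterozygous male
```

A polymorphism with one or two non-heterozygous males fails this strict check; such cases correspond to the polymorphisms described in the text as deviating only slightly from the model.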

Moreover, 18 significant polymorphisms in VSVV006, VSVV007, VSVV009 and VSVV010 deviated only
slightly from the XY sex determination model, with all female genotypes homozygous for the most
frequent allele and one or two non-heterozygous males (Additional file 3). For example, for
polymorphism 126 (crosses in Figure 2b, c), corresponding to the sex-linked indel in the second
intron of the APT3 gene [18], all females were homozygous without the indel while 92% of males
were heterozygous (23 heterozygous, one homozygous with the indel and one homozygous without
it) (Additional file 3). In the VSVV008 amplicon, only one SNP was found to deviate slightly
from the XY sex determination model (Figure 2b and Additional file 3).

Figure 2 Polymorphisms in the sex region. a) Amplicon position along the sex locus on
chromosome 2. b) Fisher test probabilities of deviation from the expected Hardy-Weinberg
genotype proportions in wild grapevines (31 males and 34 females). The significant Fisher tests
are represented by dots above the red dashed line, which is the log-transformed Bonferroni
threshold (-log(0.05/146) = 3.47). Red dots represent the [...] of heterozygous genotypes. The
heterozygosity proportions are represented by red dots in the 34 females and by blue dots in
the 31 males.

By contrast, and although a few of them were found to depart significantly from Hardy-Weinberg
proportions (Fisher test), the polymorphisms found in VSVV002, VSVV003, VSVV004 and VSVV005
largely deviated from the XY model, particularly in male genotypes (Figure 2 and Additional
file 3).

In summary, 46 significant polymorphisms in the VSVV006, VSVV007, VSVV009 and VSVV010 amplicons
fitted either strictly (28) or closely (18) the XY sex-determination model. These results
allowed us to define the boundaries of the sex locus at positions 4,884,818 and 5,036,645 on
chromosome 2 of the PN40024 physical map (12x.0 version). This 151.83 kb region, externally
delimited by the gene fragments VSVV005 and VSVV011, contains 13 candidate genes (Figure 1 and
Additional file 6).
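
The size of the region and the amplicons lying inside it follow directly from the coordinates given in Table 1. The short sketch below reproduces that arithmetic; the dictionary layout and function names are hypothetical, and the two approximate positions from Table 1 are used as listed.

```python
# Amplicon coordinates (bp, chromosome 2, 12x.0 version) as listed in Table 1.
AMPLICONS = {
    "VSVV001": (4781551, 4782603),
    "VSVV002": (4822617, 4824068),
    "VSVV003": (4850582, 4851997),
    "VSVV004": (4861475, 4862891),
    "VSVV005": (4883461, 4884818),
    "VSVV006": (4900275, 4901493),
    "VSVV007": (4921838, 4923352),
    "VSVV008": (4953195, 4954179),  # approximate position
    "VSVV009": (4989467, 4990268),
    "VSVV010": (5009549, 5010222),  # approximate position
    "VSVV011": (5036645, 5037597),
}

# Boundaries of the sex locus: end of VSVV005 and start of VSVV011.
SEX_LOCUS = (AMPLICONS["VSVV005"][1], AMPLICONS["VSVV011"][0])

def length_kb(start, end):
    """Length of an interval in kilobases."""
    return (end - start) / 1000

def amplicons_within(amplicons, start, end):
    """Names of the amplicons lying entirely inside [start, end]."""
    return [name for name, (s, e) in amplicons.items() if s >= start and e <= end]

print(length_kb(*SEX_LOCUS))
print(amplicons_within(AMPLICONS, *SEX_LOCUS))
```

The two flanking amplicons are excluded by construction, since they define the outer limits of the region rather than belonging to it.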

Linkage disequilibrium in the sex region 

The intra- and inter-amplicon linkage disequilibrium (LD) was estimated on a sub-sample of 18
male and 18 female wild grapevines (Additional file 1), by calculating the pairwise squared
correlation coefficient r²VS [33], correcting for the structure and kinship of the sample. Only
sequence polymorphisms with less than 20% missing data and with a minor allele frequency
threshold of 0.2 were analysed. At these thresholds, no polymorphisms were retained in the
VSVV001 fragment.

The highest values of LD were found within and between the four sex-linked fragments
(Figure 3). The mean LD for all pairwise comparisons for the four sex-linked fragments was
r²VS = 0.72 for a total physical length of 109.76 kb. The maximum mean intra-amplicon LD was
r²VS = 0.84 over 374 bp for VSVV010 and the minimum was r²VS = 0.63 over 504 bp for VSVV009.
The maximum inter-amplicon LD was r²VS = 0.81 between VSVV006 and VSVV010 (109.39 kb) and the
minimum was r²VS = 0.63 between VSVV007 and VSVV009 (67.84 kb). The fragment VSVV008 (only
weakly linked to sex) presented a significant but lower LD with the sex-linked fragments
(r²VS = 0.31).

Diversity of the M, F and H haplotypes and signature 
of selection 

The M, F and H haplotypes for the four sex-associated amplicons (VSVV006, VSVV007, VSVV009 and
VSVV010) were assigned using maximum likelihood haplotype trees. According to the XY model and
the rules of dominance described for Vitis (M > H > F [6,7,12-14]), the haplogroup containing
haplotypes from female, male and hermaphrodite genotypes was designated as the female F
haplogroup (Additional file 4); it is supposed to contain the F haplotypes of FF female, MF
male and HF hermaphrodite genotypes. By difference, the alternate haplotypes found in male and
hermaphrodite genotypes but not present in the F haplogroup were considered as the M and the H
haplotypes respectively (Additional file 4).

For the wild female and male genotypes, the number of F haplotypes found in the female
haplotype group trees was consistent with the XY sex model (one F haplotype in male genotypes
and two in females; Table 2). However, some hermaphrodite genotypes presented, for one or two
amplicons only (never for the four amplicons simultaneously), either no or two F haplotypes.
This departure from the sex model was particularly pronounced in VSVV010.

For diversity parameters calculation and the estimation 
of selection signature, we differentiated the F haplotypes 
of the hermaphrodite domesticated genotypes from the 



Table 3 Diversity statistics for wild male/female, cultivated hermaphrodite and female haplotype groups

                       VSVV006        VSVV007        VSVV009        VSVV010
                       nn bp          849 bp         690 bp         498 bp

Wild male haplotypes
  Effective            22             22             22             18
  S                    5              1              6              12
  Nh                   3              2              3              6
  H                    0.18           0.09           0.18           0.72
  π                    0.00041        0.00011        0.00079        0.00375
  Tajima's D           -1.99 *        -1.16 (ns)     -2.07 *        -1.71 (ns)
  Fu and Li's D*       -2.91 *        -1.57 (ns)     -3.23 **       -2.10 +
  Zeng et al.'s E      -1.404 *       -0.866 (ns)    -0.551 (ns)    -0.334 (ns)
  DH test (p-value)    0.148 (ns)     0.331 (ns)     0.023 **       0.035 **

Domesticated hermaphrodite haplotypes
  Effective            22             22             20             26
  S                    11             2              3              11
  Nh                   6              3              4              5
  H                    0.72           0.26           0.36           0.46
  π                    0.00216        0.00031        0.00091        0.00474
  Tajima's D           -0.71 (ns)     -1.18 (ns)     -0.69 (ns)     -0.60 (ns)
  Fu and Li's D*       0.53 (ns)      -0.63 (ns)     -0.12 (ns)     0 (ns)

Wild female haplotypes
  Effective            71             71             71             62
  S                    13             6              26             19
  Nh                   12             7              16             17
  H                    0.69           0.47           0.86           0.72
  π                    0.00136        0.00170        0.00526        0.00744
  Tajima's D           -1.24 (ns)     0.38 (ns)      -1.01 (ns)     -0.26 (ns)
  Fu and Li's D*       -1.81 (ns)     0.24 (ns)      0.15 (ns)      1.28 (ns)

Domesticated female haplotypes
  Effective            21             22             20             16
  S                    7              4              9              16
  Nh                   6              3              8              9
  H                    0.77           0.26           0.77           0.86
  π                    0.00224        0.00090        0.00500        0.01089
  Tajima's D           0.89 (ns)      -0.85 (ns)     1.24 (ns)      0.49 (ns)
  Fu and Li's D*       0.66 (ns)      1.10 (ns)      0.86 (ns)      0.93 (ns)

S = number of segregating sites, Nh = number of different haplotypes, H = haplotype
diversity and π = nucleotide diversity. For the Tajima's D values, Fu and Li's
D*, Zeng et al.'s E and DH test: "**" indicates a p-value < 0.01, "*" a p-value < 0.05, "+"
a p-value < 0.10 and (ns) non-significance. The E statistics and the DH test
were computed using the male haplotype of V. balanseana as an outgroup.

F haplotypes of the male and the female wild genotypes,
so as to detect different diversity or selection patterns
between the domesticated and the wild compartments. Except
for VSVV010, M haplogroups presented the lowest number
of haplotypes (Nh), and the lowest level of haplotype (H)
and nucleotide (π) diversity, revealing the predominant
occurrence of one major haplotype, with a low number of
SNPs in rare variants (Table 2). The extreme case was the
VSVV007 amplicon for which only two haplotypes were
observed, differing by only one SNP over 849 bp
(polymorphism n. 3 in Figure 4). On the other hand, in VSVV010,
the M haplogroups revealed a high haplotype diversity
equivalent to that of the domesticated and wild F haplogroups,
and a higher π value than for other amplicons (Table 2).
The F haplogroups of the wild and domesticated genotypes
presented strikingly more numerous and diverse haplotypes
than the M haplogroups. Overall, domesticated and wild F
haplogroups presented similar diversity patterns.

The H haplogroups showed an intermediate diversity
pattern between the M and F haplogroups, but closer to
the M haplogroups (Table 2). For VSVV010, the H haplogroup
presented diversity patterns quite equivalent to
those of M haplogroups, except for a lower haplotype
diversity.
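
As an illustration of how the Nh, H and π statistics can be obtained from a set of aligned haplotype sequences, the following minimal Python sketch uses Nei's unbiased haplotype diversity and the mean number of pairwise differences per site. The toy sample (22 haplotypes of 10 sites with one dominant haplotype) and the function names are hypothetical and are not data or code from this study; the software and exact estimators used here are not stated in this section.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n / (n - 1) * (1 - sum(p_i ** 2))."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences per site (pi)."""
    n_sites = len(haplotypes[0])
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs) / n_sites


# Hypothetical sample: 22 haplotypes of 10 sites, one major haplotype
# plus two singletons (illustration only, not data from the study).
sample = ["AAAAAAAAAA"] * 20 + ["AAAAACAAAA", "AAGAAAAAAA"]

nh = len(set(sample))              # number of distinct haplotypes (Nh)
h = haplotype_diversity(sample)    # haplotype diversity (H)
pi = nucleotide_diversity(sample)  # nucleotide diversity (pi)
print(nh, round(h, 2), round(pi, 4))
```

In this toy case, one major haplotype and two singletons give an H of about 0.18, the kind of low value expected when a single haplotype dominates a haplogroup.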

To illustrate these findings, the haplotypes identified for 
each sex-linked amplicon are presented in Figure 4 (for 
genotype and geographic details see Additional file 7). 

This dataset shows that the three grapevine flower
sexes, male, female and hermaphrodite, could be correctly
predicted in 97% of the genotypes of our geographically
representative V. vinifera sample, using a few SNPs,
i.e. n. 4 to 7 of VSVV007 and n. 8 of VSVV010 (identified
by black circles respectively termed 1 and 2 in Figure 4a).
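
Once each of the two haplotypes of a genotype has been assigned to the M, H or F haplogroup, the expected flower sex follows from the allelic dominance M > H > F. The sketch below only illustrates that last step; the function and dictionary names are hypothetical, and the assignment of haplotypes to haplogroups from the diagnostic SNPs is not reproduced because the SNP alleles themselves are only shown in Figure 4.

```python
# Dominance rank of the three sex alleles (M > H > F).
DOMINANCE = {"M": 2, "H": 1, "F": 0}
PHENOTYPE = {"M": "male", "H": "hermaphrodite", "F": "female"}


def predict_sex(allele_1, allele_2):
    """Return the flower sex expected from the two alleles at the sex locus."""
    dominant = max(allele_1, allele_2, key=DOMINANCE.__getitem__)
    return PHENOTYPE[dominant]


for genotype in ("MF", "HF", "HH", "FF"):
    print(genotype, predict_sex(genotype[0], genotype[1]))
```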

Male haplogroups revealed significantly negative
Tajima's D and Fu and Li's D* values for VSVV006 and
VSVV009 (Table 2). For VSVV010, the Fu and Li's D*
statistic was close to the significance threshold (0.10 >
p-value > 0.05). For male haplogroups (Table 2), all
amplicons revealed negative E values, but only VSVV006 showed
a significant excess of low-frequency variants. The DH
tests detected significant positive selection on VSVV009
and VSVV010. No other sex haplogroup showed a
significant signature of selection.
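
For readers who want to relate the tabulated S and π values to Tajima's D, the sketch below implements the standard Tajima (1989) formula. It is a consistency check under stated assumptions, not the procedure used in this study: the function name is hypothetical, the mean number of pairwise differences k is approximated as π multiplied by the amplicon length, and the tabulated π is rounded. The example uses the wild male VSVV007 values (22 haplotypes, S = 1, π = 0.00011 over 849 bp).

```python
import math


def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s and
    mean number of pairwise differences k."""
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Wild male VSVV007 haplotypes: k is approximated from the rounded pi.
k_vsvv007 = 0.00011 * 849
d_vsvv007 = tajimas_d(n=22, s=1, k=k_vsvv007)
print(round(d_vsvv007, 2))
```

Under these assumptions the result falls close to the non-significant value of -1.16 reported for this amplicon; the small difference is what one would expect from the rounding of π.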

The Fst values (Table 4) revealed a wide genetic distance
between the M and F haplogroups for the four sex-linked
amplicons, though less pronounced for VSVV010. The H
haplogroups were genetically closer to M than to F
haplogroups. For VSVV007, the H and M haplogroups bore
identical haplotypes, thus displaying a null distance.
Comparatively, only slight genetic differences were found
between the wild and the domesticated F haplogroups in
VSVV006, VSVV009 and particularly VSVV010. However,
for VSVV007, the wild and the domesticated populations of
F haplogroups seem to be distinct. All genetic differentiation
values were lower in VSVV010, revealing that all sex
haplogroups are less differentiated in this region. For
the four amplicons, the intra-specific genetic distances
between male (or hermaphrodite) and female haplogroups
were much larger than the interspecific genetic distance
between Vitis sp. haplotypes (Table 4).
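
To make the meaning of these Fst values concrete, the following sketch computes a sequence-based Fst between two haplogroups as 1 - Hw/Hb, where Hw is the mean within-group and Hb the mean between-group number of pairwise differences (the estimator of Hudson et al., 1992). This estimator is an assumption made for illustration: the one actually used for Table 4 is not stated in this section, and the sequences and function names below are hypothetical.

```python
from itertools import combinations, product


def n_diff(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def fst_sequence(group_1, group_2):
    """Fst = 1 - Hw / Hb; each group needs at least two sequences."""
    hw = (mean(n_diff(a, b) for a, b in combinations(group_1, 2))
          + mean(n_diff(a, b) for a, b in combinations(group_2, 2))) / 2
    hb = mean(n_diff(a, b) for a, b in product(group_1, group_2))
    if hb == 0:
        raise ValueError("the two groups are identical and monomorphic")
    return 1 - hw / hb


# Hypothetical, strongly differentiated haplogroups (illustration only).
group_m = ["AAAA", "AAAA", "AAAT"]
group_f = ["TTTT", "TTTT", "TTTA"]
print(fst_sequence(group_m, group_f))
```

Values near 1, as between the M and F haplogroups, mean that almost all differences lie between groups; values near 0, as between wild and domesticated F haplogroups for most amplicons, mean that the two groups share most of their variation.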

Origin of the H allele

To determine the origin of the H allele, a network was
built based on F, M and H macrohaplotypes, combining
information provided by the four sex-linked amplicons
(Figure 5a). According to this haplotype network, where
the distance between pairs of genotypes is proportional
to the number of mutations between them, H macrohaplotypes
were closer to the M ones than to the F macrohaplotypes.
The network displayed two groups of H macrohaplotypes:
the first (H1), at the edge of the network,
was only composed of three domesticated grapevines:
cv. Tsolikouri (chTSO), cv. Ak ouzioum tapapskii
(chAKO) and cv. Sylvaner (chfSYLVA), while the second,
H2, grouped all the other H macrohaplotypes of the
domesticated hermaphrodite grapevines. The M macrohaplotypes
of the wild male grapevines were located
between the two H macrohaplotype groups. However,
one wild male, Lambrusque Ulany nad
Zitavou A07 (smUNZA07) from Slovakia, displayed a
macrohaplotype closer to the H2 macrohaplotypes than to
the other M macrohaplotypes (Figure 5a). This grapevine
displayed a VSVV007 haplotype not found in other wild
male grapevines, but found in two domesticated hermaphrodite
genotypes. Concerning the F macrohaplotypes, 3
subgroups could be defined according to the occurrence
of wild or domesticated macrohaplotypes (Figure 5a,b).
The F1 group was composed of a majority of wild
macrohaplotypes together with 4 cultivars: cv. Cabernet franc
(chCAF), cv. Sylvaner (chfSYLVA), cv. Lignan (chfLN) and
cv. Lameiro (chLAR). The F2 group contained mostly
domesticated macrohaplotypes. In this group, some
domesticated grapevines had two identical haplotypes
allocated to the H haplogroup in VSVV010; the
macrohaplotypes closest to the F1 and F3 groups are cv.
Dattier noir (chDTN), cv. Muscat a petits grains blanc


Table 4 Fst values between combinations of the four sex haplotype groups

Haplotype groups                                  Fst
                                                  VSVV006   VSVV007   VSVV009   VSVV010

Vitis vinifera intraspecific comparison
  Wild males vs. wild females                     0.95      0.93      0.88      0.62
  Wild males vs. domesticated hermaphrodites      0.62      0.00      0.61      0.54
  Domesticated hermaphrodites vs. wild females    0.90      0.92      0.86      0.67
  Wild females vs. domesticated females           0.17      0.62      0.16      0.08
Vitis sp. vs Vitis vinifera sylvestris            0.16      0.04      0.05      0.19

The Vitis species used for the interspecific statistics were V. balanseana, V. monticola and V. coignetiae.

Figure 5 Consensus network carried out on the F, M and H macrohaplotypes. Coloured circles group together identical haplotypes, with size
proportional to their numbers. The distance between pairs of genotypes is proportional to the number of mutations between them. a) Pie colours indicate
the proportion of phenotypic sex morphs within the group (see legend). Polygons group the sex macrohaplotypes; for example, the F haplogroup
groups the 2 female macrohaplotypes of the female genotypes plus the single F macrohaplotype of the males and of the hermaphrodites. b) Pie
colours indicate the STRUCTURE group of Bacilieri et al. [35]. The shortened names of some hermaphrodite domesticated grapevines are indicated
(Additional file 1) as an example.



(chMUF), cv. Diagalves (chfDIAGA) and cv. Savagnin
(chfSAVA77). A last group, F3, contained wild and
domesticated macrohaplotypes in approximately the same
proportions, among which cv. Portugais bleu (chfPORTBL), cv.
Grenache (chfGRENA), cv. Ak ouzioum tapapskii (chAKO)
and cv. Araklinos (chfARA).
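
The notion of distance used in this network (the number of mutations separating two macrohaplotypes) can be sketched as follows. The sequences, group contents and function names are hypothetical and serve only to show how a macrohaplotype is built by joining the haplotypes of the sex-linked amplicons and then compared with two groups; the network construction itself is not reproduced.

```python
def macrohaplotype(amplicon_haplotypes):
    """Join the haplotypes of the sex-linked amplicons into one sequence."""
    return "".join(amplicon_haplotypes)


def n_mutations(a, b):
    """Number of differing sites between two aligned macrohaplotypes."""
    if len(a) != len(b):
        raise ValueError("macrohaplotypes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_distance(query, group):
    """Mean number of mutations between one macrohaplotype and a group."""
    return sum(n_mutations(query, g) for g in group) / len(group)


# Hypothetical variable sites only (illustration, not data from the study).
m_group = [macrohaplotype(["ACG", "TAC"]), macrohaplotype(["ACG", "TAT"])]
f_group = [macrohaplotype(["TGC", "AAC"]), macrohaplotype(["TGC", "ATC"])]
h_query = macrohaplotype(["ACG", "AAC"])

print(mean_distance(h_query, m_group), mean_distance(h_query, f_group))
```

In this toy example the H query lies closer to the M group than to the F group, which is the pattern described above for the real macrohaplotypes.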

To better understand the origin of the H alleles, we
explored the relationship between the network distances of
the H macrohaplotypes from the M ones (Figure 5b) and
the geographic origin, use and degree of domestication
[51] of the cultivated grapevines. None of these
characteristics revealed a clear correlation with H macrohaplotype
positions in the phylogenetic network. We then tried to
match the network distances with the STRUCTURE
groups defined in Bacilieri et al. [35]. This work, based on
2,096 domesticated genotypes, has revealed four main
genetic groups: a) wine cultivars from Western regions,
b) table grape cultivars from Eastern Mediterranean,
Caucasus, Middle and Far East countries, c) wine cultivars
from the Balkans and East Europe, and d) a large group
of cultivars with admixed genomes. Here, ANOVA
analysis revealed a weak tendency (r² = 0.15, p = 0.09) for
the macrohaplotypes of Balkan and East European cultivars,
as compared to Western wine cultivars, to be closer to
the wild M macrohaplotypes.

Similarly, although pointing to different "degrees of 
domestication", the 3 groups of F macrohaplotypes de- 
fined above did not show a clear geographic or genetic 
structure pattern that could explain group composition. 

The network positions of the macrohaplotypes of the two
female grapevines (V. monticola and V. coignetiae) and the
male V. balanseana grapevine used as outgroups were
consistent with their sex phenotypes: both macrohaplotypes
fell in the F macrohaplogroups for the females, while the
male had one in the F macrohaplogroups and one close to
the M and H macrohaplogroups. The closest domesticated
macrohaplotypes to the V. balanseana ones belonged to
two Russian cultivars: cv. Assyl kara (chASS) and cv. Ak
ouzioum tapapskii (chAKO) (Figure 5).

Discussion 

Sex region location in Vitis vinifera subsp. sylvestris 

From a locus defined by previous works on inter-specific
crosses [17,18], 11 genes were partially sequenced on a
diverse set of male and female wild grapevines. Forty-six
polymorphisms in four amplicons were found perfectly or
strongly linked to flower sex, allowing us to locate in V. v.
subsp. sylvestris a sex locus of 151.8 kb on chromosome 2,
in full agreement with the 143 kb sex region defined by
Fechter et al. [18] on a Vitis interspecific cross. Our results
corroborate the dominance of the M allele over the F
allele, characteristic of an XY sex-determination model and
consistent with sex segregations in controlled crosses
[7,12,14]. We also confirmed that the sex locus is situated
downstream of SSR marker VVIB23, while previous
studies, based on a lower marker density, had placed it
upstream [17,20].

Within the 151.8 kb sex region, the polymorphisms of
the centrally located VSVV008 amplicon associated only
weakly with the sex trait, with only one significant SNP
and no perfect M/F association. One hypothesis to explain
this pattern may be that local recombination disrupted the
association pattern in V. v. sylvestris. Unfortunately, in our
work we were not able to unequivocally confirm the
VSVV008 position within the sex locus. The PCR
primers for this amplicon were designed based on the
synteny between several genome sequence assemblies: the
chromosome 2 of the 8x grape genome, the putative
hermaphrodite allele on the unassigned scaffold_233
(12x.0) and the male V. cinerea BAC sequencing map [18],
where VSVV008 is located as expected between VSVV007
and VSVV009. According to this information, we expected
that VSVV008 would amplify only in males; however, in
our V. sylvestris sample, it amplified regardless of sex.
Even if neither the sequence obtained nor its PCR primers
blasted anywhere else in the genome, some doubts still
remain about the true coordinates of VSVV008; only new,
specifically designed experiments may help to definitively
confirm the VSVV008 position.



Characterisation of the sex locus 

Over the four genes linked to sex, we found a strong LD,
unprecedented in Vitis vinifera, with a mean r²VS of 0.72
over 109.76 kb. In V. vinifera, LD has been shown to decay
rapidly: in more than 200 gene fragments, Lijavetsky et al.
[52] observed LD decaying below 0.2 at around
200 bp, a finding later confirmed through massive
genotyping by Myles et al. [53]. A larger LD in the sex locus, as
compared to other genome regions, could be an indication
of suppression of recombination, a feature typical of
heteromorphic XY-like chromosomal regions [23].
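
As a reminder of what an LD value of this kind measures, the sketch below computes the classical r² between two biallelic sites from phased haplotypes. It is a simplified illustration: it does not include any correction for population structure or relatedness, which the LD measure reported above may incorporate, and the haplotypes and function name are hypothetical.

```python
def r_squared(haplotypes, i, j):
    """Classical r^2 between biallelic sites i and j of phased haplotypes."""
    n = len(haplotypes)
    ref_i, ref_j = haplotypes[0][i], haplotypes[0][j]  # arbitrary reference alleles
    p_a = sum(h[i] == ref_i for h in haplotypes) / n
    p_b = sum(h[j] == ref_j for h in haplotypes) / n
    p_ab = sum(h[i] == ref_i and h[j] == ref_j for h in haplotypes) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return (p_ab - p_a * p_b) ** 2 / denom


# Hypothetical two-site haplotypes (illustration only).
complete_ld = ["AG", "AG", "TC", "TC"]  # alleles always travel together
no_ld = ["AG", "AC", "TG", "TC"]        # all four combinations equally frequent

print(r_squared(complete_ld, 0, 1), r_squared(no_ld, 0, 1))
```

A mean value of 0.72 maintained over roughly 110 kb is therefore much closer to the first situation than to the second, whereas elsewhere in the grapevine genome LD is reported to fall below 0.2 within a few hundred base pairs.
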

The lowest values of haplotype diversity (H) were
found in male haplotypes of wild grape, with the
predominant occurrence of one major haplotype, distributed
without variation over largely diverse geographic origins,
from Eastern to Western European, Caucasian and North
African provenances. Hermaphrodite domesticated grapes
displayed haplotype diversity values higher than those in
wild males, while female haplotypes had the highest
values, without notable differences between wild and
domesticated pools. The large Fst values between males
and females confirm the clear genetic differentiation
between the M and the F haplotypes. The significantly
negative Tajima's D and Fu and Li's D* values in M
haplotypes of VSVV006 and VSVV009 indicate an
excess of rare polymorphisms, revealing purifying
selection. Using V. balanseana as outgroup, the Zeng et al. E
statistics and DH test [52] confirmed this pattern for
VSVV009. For the VSVV007 M haplotypes, these statistics
were negative but not significant, probably because over
its 849 bp length, we found only one segregating site.
Such monomorphism may be a signal of stabilising
selection, in particular because our grape samples
originated from very diverse geographic regions. Indeed, in
grapevine, previous works evidenced a much higher
variation rate, with an average of 1 SNP in 47-129 base
pairs, according to the genome region and the
population studied [52,54,55]. By contrast, the F haplotypes for
the four sex-linked amplicons presented no significant traces of
selection, suggesting that these alleles are evolving under
a neutral model.
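
The contrast between the single segregating site of VSVV007 and the usual grapevine SNP density can be made explicit with a short calculation. The sketch below simply divides the amplicon length by the two published density bounds quoted above; the function name is hypothetical and the result is only an order-of-magnitude expectation, since SNP density varies between genome regions and populations.

```python
def expected_snps(length_bp, bp_per_snp):
    """Number of SNPs expected in a fragment for a given average SNP spacing."""
    return length_bp / bp_per_snp


amplicon_length = 849  # VSVV007 amplicon length in bp
low = expected_snps(amplicon_length, 129)   # sparsest reported density
high = expected_snps(amplicon_length, 47)   # densest reported density
print(round(low, 1), round(high, 1))
```

Under these densities, roughly 7 to 18 SNPs would be expected over 849 bp, compared with the single segregating site observed in the wild male haplotypes.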

Overall, the sex region presents traits typical of a
small XY non-recombining region [21]. According to
the commonly accepted model of sex chromosome
evolution in plants, such a region can appear in
dioecious species when recombination suppression occurs
between two closely located male- and female-sterile
mutations [22]. The F allele is expected to contain a
recessive, "loss-of-function" type, male sterility mutation
whereas the M allele would harbour a fully-functioning
male fertility allele with, at a nearby locus, a dominant
female sterility mutation [23]. In such a case, the M
allele is expected to be constrained by selection against
recombination between the two sex-determining loci,
since recombination may bring either total sterility, or
reversion to the ancestral hermaphrodite state. The
accumulation of insertions, inversions, repeated elements and
chromosomal rearrangements in the X and the Y
counterparts [56] may add to this mechanism, impeding local
chromosome pairing at meiosis. Indeed, in this locus,
Fechter et al. [18] reported the presence of additional
repeated FMO elements and of a retrotransposon in the
female allele, both absent from the male allele. These
structural differences may help repress local
recombination between M and F alleles. The suppression of
recombination may in turn be at the origin of the linkage
disequilibrium, and it may as well explain part of the
reduction of diversity in M alleles.

In this region, the weaker association with the sex
trait in a distal (VSVV010) and, if accurately located, a
central (VSVV008) gene could be a trace of some
recombination events, sufficient to break the association
with the sex causal genes, but not ample enough to
completely blur LD traces (Figures 2 and 3). Rare
recombination events could have prevented the evolution
of this small sex region into a full sex chromosome in
Vitis, although dioecy is supposed to have appeared in
this taxon millions of years ago [57]. Finally, if
VSVV008 is indeed located in the sex locus, sex
determination in Vitis might be the result of two distinct sets of
mutations in two linked gene regions, one including
VSVV006 and VSVV007, and the other including
VSVV009. As in Fragaria virginiana Mill. [58], the
female- and male-sterile mutations might not be completely
linked, allowing the appearance of neuter and
hermaphrodite individuals. Some hermaphrodite grapevines have
already been observed in natural conditions, but their wild status
is still uncertain today as they may be escapees from
cultivation [59]. Similarly, in the long-lived, late-flowering and
disease-prone grapes, while non-flowering plants are
observed both in the wild and in experimental breeding, it is
very difficult to unequivocally establish whether these are
neuter or just growing in flowering-limiting conditions.

The length of the small XY region in Vitis vinifera is
less than 1% of the chromosome length, much shorter
than the small sex region identified in papaya, which
covers 10% of the chromosome [60]. In this small sex
region, the flavin-containing monooxygenase (FMO) genes
and the adenine phosphoribosyl transferase (APT3) have
already been suggested as good functional candidates for
flower sex determination in grapevine [18]. Other
candidate genes could be mentioned, such as the
trehalose-6-phosphate phosphatase (TPP), which controls inflorescence
architecture in maize through sugar signal modification
[61] and whose direct product, the disaccharide trehalose,
has a marked effect on flowering transition [62]. The
WRKY transcription factors are one of the largest
families of transcriptional regulators [63] and one of these
factors has been shown to regulate endosperm growth and
cellularization in Arabidopsis [64]. The VSVV008
amplicon was designed in a gene whose predicted protein
shows similarity with an Ethylene Overproducer-like 1
(ETO1) protein. The Arabidopsis ETO1 protein specifically
inhibits the enzyme activity of the
1-aminocyclopropane-1-carboxylate synthase (ACS) [65,66], known to be involved
in sex determination in melons (Cucumis melo) [67].
However, for the YABBY protein, the polymorphisms did
not correlate with phenotypic sex, suggesting that the
association found by Battilana et al. [20] is the result of an
extended intergenic linkage disequilibrium (LD), and not a
direct indication of a causal mutation.

Origin of the H allele and traces of domestication 

The last objective of this study was to elucidate the origin
of hermaphroditism in domesticated grapevines. Both Fst
and network analysis revealed that H haplotypes are more
closely related to M than to F haplotypes. Thus, the H
allele of the domesticated hermaphrodite grapevines may
have derived from the M allele of wild male grapevines as
suggested by previous authors [13,68]. Interestingly, in
Carica papaya, hermaphrodites are also heterozygous for
a Y chromosome variant (Yh), more similar to the
male-determining Y than to the X [60,69]. However, while all
combinations of Y and/or Yh are lethal in papaya, in Vitis
HH genotypes do thrive and set seeds, as in the case of
certain domesticated grapevines such as Chardonnay,
Muscat de Hambourg, Riesling or Cardinal (Truel pers.
comm., Vassal INRA), which produce 100%
hermaphrodite progenies. Thus, the H allele may be an M allele
having lost the dominant female sterility mutation,
explaining the dominance of the M allele over the H
allele. This hypothesis could also explain the increase in
diversity observed in the H haplotypes as compared to
the M haplotypes.

Studying phylogenetic patterns among the haplotypes,
we could only find a weak tendency of the H
macrohaplotypes of cultivars from Eastern regions,
as compared to Western cultivars, to be closer to the wild M
macrohaplotypes. Former studies situated the major
grapevine domestication region in the Eastern part of
the Mediterranean area [9,53], which is thus consistent
with our data.

More interestingly, the network analysis showed that
both the F and M/H haplogroups are each divided into
subgroups. In particular, wild female macrohaplotypes
are subdivided into two main groups, one closely
connected to the V. balanseana F haplotype, and the other
farther away from Vitis sp. females; domesticated female
haplotypes are divided into three groups, the first one
close to the V. balanseana group, and the other two
branching as independent lineages from the main V.
sylvestris haplogroup. Similarly, in the M/H group, while
the small differentiation within M haplotypes allows for
less discrimination in the wild haplogroup, the cultivated
hermaphrodites are again divided into groups, one
including Eastern varieties and the other with a Western
component. The general picture obtained with the network
analysis points to a genetic structure of the wild V. vinifera
haplotypes, in relation with other species, supporting the
hypothesis presented in Peros et al. [25] that two
chloroplast lineages from different Asian species (V. piasezkii, V.
amurensis and V. thunbergii) contributed to the
emergence of wild V. vinifera populations in Europe. On the
other hand, the group differentiation in the domesticated
compartment, both for the F and the H haplotypes,
suggests multiple domestication events, as advanced by
Arroyo et al. [11] based on chloroplast genetic diversity.
More surprisingly, we found that the H haplotype from cv.
Assyl kara, a Russian cultivar, derives directly, via a series
of mutations, from V. balanseana (Figure 5). In the F
group, the cultivar closest to V. balanseana is also a
Russian cultivar, cv. Ak ouzioum tapapskii. Based on this
evidence, we can advance the hypothesis that, in the sex
region, in addition to the already known contribution from
V. vinifera ssp. sylvestris, domesticated grapes enclose a
genetic contribution from different Asian species. It is
historically known that during the Soviet Union period,
Russian agricultural researchers were active in importing
genetic variability from diverse Asian regions as a source
of cold or disease resistance alleles [70]. Indeed, Venuti
et al. [71] recently showed that the Asian Vitis amurensis
was used by breeders to introgress resistance genes into
cultivated grapevines. However, since in our sample the
cv. Assyl kara was recorded as one of the oldest traditional
cultivars from North Caucasus [72], the introgression of a
genetic contribution from Asian species into cultivated
grapes may also significantly predate early 20th century
breeding activities in Russia. It could well have occurred
naturally through gene flow between different interfertile
Vitis species followed by selection during domestication.

The very small differentiation found between the H and
M haplotypes in the sex-linked amplicons and the small
number of individuals studied here make it difficult to
clarify further the domestication pathway; this issue undoubtedly
merits further exploration, reinforcing the argument
of Venuti et al. [71] that new prospecting and collection of
wild grapes and other Vitis species in the Eastern part of the
domestication range are strongly needed.

The phylogenetic position of V. balanseana, V. coignetiae
and V. monticola grapevines in our network, as well as
segregation mapping in inter-specific crosses, both
support a sex locus shared by all Vitis spp. [7,17], suggesting
that the development of heteromorphic sex chromosomes
is still in the very first stage of evolution in this taxon. In
general, the age of a sex-determining region can be
estimated from the age of the taxon in which it is found [23].
As dioecy is the ancestral condition in the subgenus Vitis,
its sex-determining region should be at least as old as the
separation of the Vitis and Muscadinia subgenera, thought
to have diverged approx. 18 My ago [57]. Other dioecious
species with a sex region of approximately the same age,
such as Silene latifolia [73], Bryonia dioica [74] or Rumex
spp. [75], have reached the final stages of sex
chromosome evolution, with either full heteromorphic sex
chromosomes or very large regions encompassing
hundreds of genes. Future works to fully sequence the sex
locus in a larger sample of genotypes in Vitis species
could help to understand why some dioecious
plants rapidly developed specific sex chromosomes,
while others did not.



Conclusions 

In Vitis vinifera subsp. sylvestris, we confirmed a sex locus
of 151.8 kb located downstream of the marker VVIB23
and displaying haplotype diversity, linkage disequilibrium
and differentiation that typically correspond to a small XY
sex-determining region with XY males and XX females.
This small sex-determining region, spanning less than 1%
of chromosome 2 and also present in other Vitis species,
suggests that grapevines could be organisms of choice to
study the early stages of evolution of sex chromosomes in
perennial species.

Hermaphrodite alleles appear to derive from male alleles
of wild grapevines, with successive recombination events
allowing import of diversity from the X into the Y
chromosomal region and slowing the expansion of the region into
a full heteromorphic chromosome. Macrohaplotype
network patterns are consistent with a major grapevine
domestication region in the Eastern part of the Mediterranean
area and secondary domestication events in geographically
distinct areas. Finally, we hypothesise that in the sex region
some domesticated grapes enclose a genetic contribution
from different Asian species. Our findings should
encourage new prospecting and collection of wild grapes,
including other Vitis species, in the Eastern part of the
domestication range.

Availability of supporting data 

The sequence data sets supporting the results of this
article are available in the GenBank repository
[GenBank: KJ575622-KJ57662; http://www.ncbi.nlm.nih.gov/genbank/].